Structured Query Language (SQL) was developed to work with relational databases that organize and store information in groups of columns and rows, called tables. They are “relational” because of relations linking data between different tables (think: Excel).
SQL has lasted for decades, and it’s become the industry standard because of its ease, versatility, and power in working with transactional data. SQL is easy to learn for beginners, and can take developers far in their careers. It’s still the best language for defining data architecture, and it remains more popular among data engineers and scientists than languages like Python or R.
As a declarative language, it’s concise and leaves the heavy lifting up to the database system. This is a bit different from the way things work in imperative languages, where we give the system the exact steps for what we want to do (this StackOverflow thread explains declarative vs imperative languages.
Learning the fundamentals and logic of working with data through SQL gives developers a strong foundation for moving to newer data-oriented tools, like Spark, and functional languages, like Scala.
Databases are a key element to building modern apps, and one of the best ways to learn is by doing. Let’s get hands-on with SQL! In these SQL tutorials, we will run SQL queries manually, but they’re the same types of queries that act as the building blocks behind application programming interfaces (APIs) used in building apps.
A relational database is a database that exhibits two key SQL concepts: tables and data relations (hence the name “relational database”). Each table consists of rows and columns, and the configuration of the tables is known as its “schema.”
For example, here’s a graphical diagram of the schema of two tables we’ll use in this guide. Below each table is the list of the columns for that table, along with their corresponding data type. The two tables are related by the id column of the Animals table.
A helpful way to think about tables is to view them as a spreadsheet with columns for the information we want to track and rows for each data entry we want to store. We can create multiple sheets with different types of information inside each sheet, and then use SQL queries to link and work with the data across all of them.
SQL consists of three different types of underlying groups:
DDL allows us to define what the structure of our databases looks like using commands such as CREATE and ALTER. We can imagine it as setting up and labeling shelves for our data and specifying how we want to organize it before moving and working with it.
DML provides the methods for how to manipulate the data to actually do the adding, changing, and deleting through commands like SELECT, INSERT, UPDATE, and DELETE.
DCL/TCL enables us to specify who controls our databases with rights and permissions.
Lastly, there are also utility functions that provide us information, such as showing a list of tables or user permissions.
Let’s pretend we are starting a new animal shelter called Animals 4 Homes, and go through its journey by helping set up its database as it grows and expands.
To follow along with this guide, you’ll need a SQL environment. If you would like an easy way to connect and use a SQL database without any installation or setup, you can use this SQL Playground. Note that the steps in this guide are not specific to CockroachDB, and the knowledge from this guide can be used with PostgreSQL, MySQL, and other relational databases.
All right! Animals 4 Homes is getting ready to open and help dogs find loving homes.
To get started, we need to set up a SQL database to store and manage each dog as they come in and get adopted.
First, we need to create a database for our data. We’ll name it pet_adoption using the following SQL statement:
CREATE DATABASE pet_adoption;
Creating a database doesn’t automatically set it as the active database, so now let’s select the pet_adoption database with the USE
command:
USE pet_adoption;
Next, we need to set up the tables that will store our data.
This is a key step in working with SQL because this step is where we define what information we will store and how it’s organized. Even though we can change or “alter” the table later, configuring it well from the start can keep our database running smoothly without the need for us to shuffle things around later.
For this project, let’s create just two tables: animals and adoptions.
The first table, animals, will maintain a list of the dogs that come through our shelter.
We’ll call this table “animals” instead of “dogs” because it’s likely that an animal shelter could expand to more types of animals than just dogs (psst, we will do this in the upcoming steps). Naming it as such from the start will keep us from needing to rename the table in the future.
The information we might want to have in this table are the dog’s unique ID, its name, breed, color, sex, and adoption status. Let’s set up the columns for them accordingly.
Create the animal table with the following command:
CREATE TABLE animals (id UUID NOT NULL, name STRING, breed STRING, color STRING, gender STRING, status INTEGER);
We’re going to assign a Universally Unique Identifier (UUID) as the id for each animal registered into the database. If you aren’t familiar with UUIDs, it is a uniquely generated combination of 32 hexadecimal numbers. For example, one UUID might look like this: 123e4567-e89b-12d3-a456-426614174000. It’s usually generated in code, but you can also use an online generator to get one manually.
The other interesting column in our table is status, where we will store a 0 or 1 integer for each animal to represent its adoption status. This number value is arbitrary, but let’s say that 0 = open for adoption and 1 = adopted. Using an integer for this column instead of a text string will help keep the overall database size smaller. Simultaneously, this will make our SQL queries faster because numerical operations are easier and quicker to compute than text operations.
The second table will be named adoptions. It will be used to track all adoption transactions.
Each adoption record will include an animal_id UUID that we will add to match the UUID value in the id field that we set for the animal in the first table. In this guide, we’ll copy the UUID value manually into both tables, but often it’s generated with code. This table will also contain the name and contact info of the person adopting the animal and the date of adoption.
Create this adoptions table using the following command:
CREATE TABLE adoptions (animal_id UUID NOT NULL, name STRING, contact STRING, date TIMESTAMP);
We added a date column with the TIMESTAMP data type in this table, which will enable us to sort by the most recent adoptions or to search for adoptions between a specific timeframe.
While the design above for our database is just one option of many other possible variations, keep in mind that we can organize the same information in different ways. We can add as many tables, columns, and details as necessary and appropriate. However, keeping things simple with fewer tables and columns is generally good for both database complexity and speed.
To give an example, we could create a table for each breed of dog, but that could quickly grow to over a hundred tables and make it difficult for us to search for the right dogs. At the same time, if we have too little detail and don’t track the dog’s breed at all, we wouldn’t be able to specify the type of dog potential adopters are looking for using our SQL queries. This would limit the usefulness of the database.
Before we move forward, let’s make sure that our tables were created properly in the database.
Run this command to get the list of tables in the current database and check that we have both the animals table and the adoptions table:
SHOW TABLES;
If you see both tables, then you can run these two statements to make sure that the columns of each are correct:
SHOW COLUMNS FROM animals;
SHOW COLUMNS FROM adoptions;
Now that our database schema looks good, we’re ready to start accepting dogs at our shelter.
It’s time to welcome and register the first dogs coming to Animals 4 Homes!
They are eager to check in and look cute to visitors, so let’s add them to the system quickly. We can do this using an INSERT statement on the animals table that looks like the following:
INSERT INTO animals (id, name, breed, color, gender, status) VALUES ('89354034-20d9-4c3d-8195-3294bfd9dbc5', 'Bellyflop', 'Beagle', 'Brown', 'Male', 0);
Here is the list of all of the dogs waiting in line. Try putting them into the system based on the above SQL statement.
Did you get it? The SQL INSERT statements should look like this:
INSERT INTO animals (id, name, breed, color, gender, status) VALUES ('89354034-20d9-4c3d-8195-3294bfd9dbc5', 'Bellyflop', 'Beagle', 'Brown', 'Male', 0);
INSERT INTO animals (id, name, breed, color, gender, status) VALUES ('ae91cf1c-f972-42f3-8160-6c04d935699c', 'Snowy', 'Husky', 'White', 'Female', 0);
INSERT INTO animals (id, name, breed, color, gender, status) VALUES ('37df3388-b0f4-4f0d-b6ef-0d840923a4d8', 'Princess', 'Pomeranian', 'Black', 'Female', 0);
INSERT INTO animals (id, name, breed, color, gender, status) VALUES ('94545432-d27a-4ac8-ab7c-38270d7535f3', 'Cricket', 'Chihuahua', 'Brown', 'Male', 0);
INSERT INTO animals (id, name, breed, color, gender, status) VALUES ('a1e7a7fc-b429-41ec-9924-8bb39dd397c8', 'Princess', 'Poodle', 'Purple', 'Female', 0);
INSERT INTO animals (id, name, breed, color, gender, status) VALUES ('5138ed53-2ab2-400b-973c-91186f8c673d', 'Spot', 'Dalmation', 'Black and White', 'Male', 0);
With the full list of dogs added to our database, we can try running some SELECT queries to look through them. The following are some small examples of possible SQL statements to run.
Get the full list of all properties of all dogs (defaults to a limit of 100 rows):
SELECT * FROM animals;
Get the breeds of all dogs:
SELECT breed FROM animals;
Get the names of only female dogs by including a WHERE clause:
SELECT name FROM animals WHERE gender = 'Female';
Get the IDs of dogs up for adoption:
SELECT id FROM animals WHERE status = 0;
For practice, test and get creative with your SQL statements to get a feel for using SELECT statements with WHERE clauses.
It turns out that some of the original info for our dogs needs correcting (since purple dogs don’t exist), so let’s look at how we can update and delete information in our database.
The best way to update a row is to use each dog’s unique id so that we can be sure of the specific single row that is changed. So, to correct Princess the Poodle’s color from Purple to Brown, we can use the following statement:
UPDATE animals SET color = 'Brown' WHERE id = 'a1e7a7fc-b429-41ec-9924-8bb39dd397c8';
(Remember, you can get the id of a dog using a SELECT statement.)
While it might seem easier and more natural to update the database using the dog’s name, we need to make sure only this one poodle’s color entry is changed. Since we have two dogs named “Princess” in our shelter — a pomeranian and a poodle — and we may get other dogs in the future also named “Princess,” a statement like the one below can cause problems because it will change all of their colors in the database to Brown:
UPDATE animals SET color = 'Brown' WHERE name = 'Princess';
However, another option that works is to specifically fix Purple color entries in the database since no dog should be purple colored.
UPDATE animals SET color = 'Brown' WHERE color = 'Purple';
While we were preparing to move onto the next section, the owner of one of the dogs came looking for his beagle and found him here. So let’s remove Bellyflop the Beagle from our database using a DELETE statement before someone tries to adopt him.
DELETE FROM animals WHERE id = '89354034-20d9-4c3d-8195-3294bfd9dbc5';
Great news! We found a home for Cricket the Chihuahua. Let’s record our first adoption ever for Animals 4 Homes in our database.
To do this, we’re going to need to run two different statements: one first to update Cricket’s status to 1 for “adopted” in the animals table, and another to insert the event into the adoptions table to make it final.
The first statement will look like this, using Cricket’s id field:
UPDATE animals SET status = 1 WHERE id = '94545432-d27a-4ac8-ab7c-38270d7535f3';
The adoption SQL statement should look like the following:
INSERT INTO adoptions (animal_id, name, contact, date) VALUES ('94545432-d27a-4ac8-ab7c-38270d7535f3', 'Pinocchio', 'realboy@cockroachlabs.com', NOW());
In our INSERT
statement, you might notice that we used NOW()
for the date value. This special SQL function automatically sets that value to the current time inside the server so that we don’t need to look up the time and write it down.
Two more of our dogs are going to a new home, so try writing the four SQL statements required to properly record the following adoptions from the list below:
Did you get it? The SQL statements should look like this:
UPDATE animals SET status = 1 WHERE id = 'a1e7a7fc-b429-41ec-9924-8bb39dd397c8';
INSERT INTO adoptions (animal_id, name, contact, date) VALUES ('a1e7a7fc-b429-41ec-9924-8bb39dd397c8', 'Patalie', 'poodlequeen@cockroachlabs.com', NOW());
UPDATE animals SET status = 1 WHERE id = '5138ed53-2ab2-400b-973c-91186f8c673d';
INSERT INTO adoptions (animal_id, name, contact, date) VALUES ('5138ed53-2ab2-400b-973c-91186f8c673d', 'Ella', 'ellacrew@cockroachlabs.com', NOW());
Now, let’s check to make sure that our queries ran correctly inside our database.
We should see the three new adoption entries when we run the following:
SELECT * FROM adoptions;
We can also view them sorted by most recent adoption (DESCending order of dates) using the ORDER BY modifier:
SELECT * FROM adoptions ORDER BY date DESC;
Finally, we should make sure that the status of the correct three dogs (Cricket the Chihuahua, Princess the Poodle, and Spot the Dalmatian) were set to 1 (“adopted” status).
Verify that you get three results when you run:
SELECT * FROM animals WHERE status = 1;
Our new animal shelter seems to be running well, thanks to the SQL adoptions system!
The organization has now decided to accept cats as well and find them new homes. Let’s update our database system to handle cats and potentially other species of animals in the future.
First, we will need to add a new species column to the animals table to track whether an animal entry is for a dog or a cat. This can be done using the ALTER TABLE command:
ALTER TABLE animals ADD COLUMN species STRING;
And we can verify that the table was properly updated using the SHOW COLUMNS command.
SHOW COLUMNS FROM animals;
Before we start adding any of our cats, we have a chance to easily set the species for all of our current animals in the database to “Dog” because we already know that there are only dogs in the system.
To do this, we first need to turn off safe updates so that we can use the UPDATE statement without a WHERE clause and update all rows in the table with one statement.
SET sql_safe_updates = FALSE;
Then run the following query to specify “Dog” for all of the current animals in the database:
UPDATE animals SET species = 'Dog';
Now, if we retrieve the animals table, we should see every row configured as a dog.
SELECT * FROM animals;
We can start adding our new cats into the system. Adding these cats can be done similarly to the INSERT INTO statements we used earlier to add the dogs. But we have to remember to add the species column and set the value to “Cat.”
Here is the list of cats we just took in. Try to add them to the database:
Did you get it? The SQL statements should look like this:
INSERT INTO animals (id, name, species, breed, color, gender, status) VALUES ('11d6fa07-449f-4053-a7cb-ae4ec8570f3f', 'Meowmix', 'Cat', 'Munchkin', 'Yellow', 'Female', 0);
INSERT INTO animals (id, name, species, breed, color, gender, status) VALUES ('4e55860a-ec39-494b-845a-2e0a6496bf9b', 'Ash', 'Cat', 'Persian', 'Gray', 'Female', 0);
INSERT INTO animals (id, name, species, breed, color, gender, status) VALUES ('e80b92e5-98ed-458d-885d-b9e05d0d123e', 'Tiger', 'Cat', 'Bengal', 'Brown', 'Male', 0);
Just to be sure, let’s check that the cats were added to the database properly.
SELECT * FROM animals;
Animals 4 Homes was such a big success that we are now expanding to other locations!
We want to continue using the same database system to keep track of all of the animals and handle all the adoptions, so we’re going to need to make some updates to the database.
All of the shelters will be tracked inside our database from now on. So, create a third table named shelters with columns for an ID, name, and location. Unlike the animals table, where we may eventually have thousands or even millions of animals in the database, we can use an integer for the id column of shelters instead of a UUID for simplicity.
CREATE TABLE shelters (id INTEGER, name STRING, location STRING);
Let’s add our current and first shelter, Animals 4 Homes, located in Red City:
INSERT INTO shelters (id, name, location) VALUES (1, 'Animals 4 Homes', 'Red City');
And next, we need to adjust the animals table with a new shelter column and while we have this chance, update all of the existing rows of animal data to be located in this shelter id = 1.
ALTER TABLE animals ADD COLUMN shelter INTEGER;
UPDATE animals SET shelter = 1;
Then we can create entries for the other two shelters joining our efforts for pet adoption: Adopt A Buddy and Fluffy Animals:
INSERT INTO shelters (id, name, location) VALUES (2, 'Adopt A Buddy', 'Green Town');
INSERT INTO shelters (id, name, location) VALUES (3, 'Fluffy Animals', 'Blue Hills');
Our new animal shelters are taking in some new animals who are looking for loving homes.
Here’s the latest list of animals for you to try adding to the database:
Did you get it? These are the SQL statements to add the new animals:
INSERT INTO animals (id, name, shelter, species, breed, color, gender, status) VALUES ('ac1a773d-6912-45cc-b296-1b58d13dbd32', 'Snoops', 2, 'Dog', 'Beagle', 'Brown', 'Male', 0);
INSERT INTO animals (id, name, shelter, species, breed, color, gender, status) VALUES ('56e9be70-ed24-4949-bb54-70329c2caf82', 'Salt', 2, 'Cat', 'Turkish Angora', 'White', 'Female', 0);
INSERT INTO animals (id, name, shelter, species, breed, color, gender, status) VALUES ('b91df9f1-8e3d-4cce-a4ba-9d1a3ad6a12b', 'Fuzz', 3, 'Dog', 'Papillon', 'Gray', 'Male', 0);
There are two more things we are going to do with our database: We’re going to create an index to help boost the performance of our SQL queries, and we’re going to run a JOIN statement across multiple tables to pull information across all of them into a single result. For a more indepth read on JOINS, pop over to this blog post JOIN Ordering: An Introduction
An index basically hints to the database that a particular column is important, and should be used to help sort and filter through the data. This tells the database tasks to add some indexing information each time some data is added or updated. This way, the whole database can run more efficiently, and queries through the data can be completed much faster. In our case, we can create an index for the shelter column in the animals table because that will be one of the key ways to divide up the data. For more experiments with indexes, pop over to Full Text Indexing and Search.
To do so, we can run this command:
CREATE INDEX animal_shelter ON animals (shelter);
Finally, using JOIN
, we can query across multiple tables and retrieve not only each animal’s information but their shelter in a single SQL statement instead of doing the work ourselves.
It looks a bit like this:
SELECT * FROM animals JOIN shelters ON animals.shelter = shelters.id;
The ON
clause above specifies that the shelter column in the animals table will match an id in the shelters table. That will be the link that makes the combined table query result possible.
Okay, what if we would like to get a list of the adopted animals from the first shelter? See if you can figure out what the statement should be. (HINT: WHERE
are you changing your statement?)
Did you get it? Here is the SQL statement:
SELECT * FROM adoptions JOIN animals ON adoptions.animal_id = animals.id WHERE animals.shelter = 1;
In the modern data-driven world, being able to work with databases effectively can be a sort of technological superpower. And now that you’ve learned how to use a relational database and write SQL statements to add, update, delete, and search through it, you now have the power to work with large amounts of data.
If you want to keep learning, here are a few resources to boost your skills:
And if you’re looking for a great place to practice and build upon the SQL skills you learned here, you can get started very easily on a forever-free database with zero installation or server setup. Sign up to get started.
Happy querying!
CockroachDB’s support for SQLAlchemy is currently in beta, but we’re actively developing new …
Read moreAs a developer, you are often faced with the question: why is my application slow? You may have already identified the …
Read moreIn two hours (the same amount of time it takes to binge two episodes of The Witcher) you will build a ride-sharing …
Read more